
Poster: aibek Date: Dec 31, 2012 6:32am
Forum: texts Subject: Re: PDF's on Amazon Kindle

First about the file.

Every page image that we see in this file is composed of 3 images: a background, a foreground and a mask. So the pdf really contains not 116, but 348 images. The foreground and background are in RGB, the mask is BW (1 bit per sample).

To see why this is useful, imagine brown text on a yellow page. If such a 1000x1000 sample image is saved in RGB (3 bytes per sample), it would take 3,000,000 bytes before compression. In the IA method the image is decomposed into 3 parts: a brown 1000x1000 image, a yellow 1000x1000 image, and a BW 1000x1000 mask, and each of them is compressed and saved separately in the PDF. It is the reader’s job to join them together. '0' in the mask image means the reader is to compose the final image by using the corresponding pixel of the foreground image; '1' means that the reader is to use the corresponding pixel of the background image.
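The joining step can be sketched in a few lines of Python. This is only an illustration of the mask convention described above ('0' picks the foreground, '1' picks the background), not how any particular reader implements it; the function name `compose` and the two colour values are made up for the example.

```python
def compose(foreground, background, mask):
    """Build the final image pixel by pixel.

    Convention as described in the post: mask 0 -> take the foreground
    pixel, mask 1 -> take the background pixel. All three inputs are
    assumed to have the same dimensions here.
    """
    return [
        [fg if m == 0 else bg for fg, bg, m in zip(fg_row, bg_row, mask_row)]
        for fg_row, bg_row, mask_row in zip(foreground, background, mask)
    ]


# Hypothetical colours for the "brown text on yellow page" example.
BROWN = (150, 75, 0)
YELLOW = (255, 255, 0)

# A tiny 3x2 "page": the middle column is text, the rest is paper.
mask = [
    [1, 0, 1],
    [1, 0, 1],
]
foreground = [[BROWN] * 3 for _ in range(2)]
background = [[YELLOW] * 3 for _ in range(2)]

page = compose(foreground, background, mask)
```

Note that the foreground and background here are single flat colours, which is exactly the case that compresses to almost nothing; all of the shape information lives in the mask.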

This helps as the three can be compressed much better. At an extreme, you can always imagine the brown and the yellow images taking just a few bytes each: 3 bytes for recording the colour, and a few bytes more for recording the dimensions of the image. Also, our most critical data is in the mask -- that image has to be the sharpest. But it now has 1-bit samples, and not the 24 bits (3 bytes) per sample we had earlier.

There are a few more details. First, the background image is saved at a lower resolution, and the reader is asked to interpolate. So a small, if insignificant, loss in quality creeps into the IA "compression". Second, the background and the foreground images are not limited to one colour -- they can be full-fledged RGB images too. The point is that the background image is filled with the surrounding colour in the places where the (foreground) objects are -- those parts will not be used anyway -- and thus has a more or less uniform colour or gradient. Similarly, the foreground image is filled with nearby colours in the places where it will not be read. This way the compression is much better. The critical steps are (i) identifying the foreground and background properly, and (ii) filling colours in the areas which will not be read, so that the image compresses as well as possible with the compression method of choice ('JPXDecode' for the RGB images in IA files). IA’s PDF files are produced by 'LuraDocument PDF v2.28'.
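To make step (ii) concrete, here is a rough Python sketch of one very simple fill strategy: every background pixel that the mask hides is replaced by the average colour of the visible background pixels. This is an assumption chosen for illustration only; I do not know what LuraDocument actually does, and a real encoder would more likely fill from nearby pixels than from a global average. The function name `fill_unused_background` is made up.

```python
def fill_unused_background(image, mask):
    """Smooth out the parts of the background that will never be shown.

    Uses the mask convention from the post: mask 1 -> the background pixel
    is visible, mask 0 -> it is covered by the foreground and can be set to
    anything. Hidden pixels get the average colour of the visible ones.
    """
    visible = [
        px
        for img_row, mask_row in zip(image, mask)
        for px, m in zip(img_row, mask_row)
        if m == 1
    ]
    if not visible:
        # Nothing visible to average over: return an unchanged copy.
        return [list(row) for row in image]
    n = len(visible)
    average = tuple(sum(px[c] for px in visible) // n for c in range(3))
    return [
        [px if m == 1 else average for px, m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


# One scanned row: yellowish paper, a dark brown ink pixel, paper again.
scan = [[(250, 250, 0), (90, 40, 0), (254, 254, 0)]]
mask = [[1, 0, 1]]  # the middle pixel belongs to the foreground

background = fill_unused_background(scan, mask)
```

After the fill, the ink pixel in the background is replaced by a paper-like colour, so the background no longer has a sharp edge there and should compress better.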

The compression is pretty significant. In the attached page 2 of the Devises et Emblemes file, the three images together take 42 KB. The dimensions of the composed image, however, are 2201x3063, so as uncompressed RGB (3 bytes per sample) it would take about 19 MB! This is the size of the image you will get when you join the three together in the intended manner.
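The arithmetic can be checked with a few lines of Python. The helper name `raw_size_bytes` is made up, the 42 KB figure is taken as 42 x 1024 bytes (an assumption about the unit), and the mask size ignores any per-row padding a PDF might add.

```python
def raw_size_bytes(width, height, bits_per_sample):
    """Uncompressed size of an image, rounded up to whole bytes."""
    total_bits = width * height * bits_per_sample
    return (total_bits + 7) // 8


width, height = 2201, 3063

rgb_bytes = raw_size_bytes(width, height, 24)  # 3 bytes per sample
mask_bytes = raw_size_bytes(width, height, 1)  # 1 bit per sample

stored_bytes = 42 * 1024  # the three compressed images, assuming KB = 1024 bytes
ratio = rgb_bytes / stored_bytes

print(f"RGB, uncompressed : {rgb_bytes:,} bytes ({rgb_bytes / 2**20:.1f} MiB)")
print(f"mask, uncompressed: {mask_bytes:,} bytes")
print(f"RGB vs stored     : about {ratio:.0f}x")
```

So the composed page is roughly 20.2 million bytes (the "19 MB" above if you count in units of 1024 x 1024 bytes), several hundred times the 42 KB actually stored.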

So, most likely, the Kindle is refusing to do this work. Please check for the text-layer issue too. The attached orig-p.2.pdf is p. 2 of the book. The minus-textlayer.pdf is that page minus the text layer. (Both files are 40 KB in size.) I am assuming that your Kindle can read neither of the files.

Poster: aibek Date: Dec 31, 2012 6:38am
Forum: texts Subject: Re: PDF's on Amazon Kindle

I forgot to add the files in the previous post.

Attachment: orig-p.2.pdf
Attachment: minus-textlayer.pdf

Poster: stbalbach Date: Jan 1, 2013 2:19pm
Forum: texts Subject: Re: PDF's on Amazon Kindle

I tried both and got the same error about being unable to display elements, and it shows a blank page. You are probably correct that the problem is these readers don't have the CPU and/or memory for the compression.

Poster: aibek Date: Jan 5, 2013 7:52pm
Forum: texts Subject: Re: PDF's on Amazon Kindle

Btw, the pdf reader is supposed to paint two images, one on top of the other: the background image in the background, and the foreground+mask image (i.e. a cut-out of the foreground image) over it. So it is really due to the presence of layers of images, as you initially suspected.